ABSTRACT

The capacity of an organism to respond to its envir- 
onment is facilitated by the environmentally induced 
alteration of gene and protein expression, i.e. 
expression plasticity. The reconstruction of gene
regulatory networks based on expression plasticity
can yield new insights not only into the causality of
transcriptional and cellular processes but also into the
complex regulatory mechanisms that underlie biological
function and adaptation. We describe an
approach for network inference by integrating 
expression plasticity into Shannon's mutual infor- 
mation. Beyond Pearson correlation, mutual infor- 
mation can capture non-linear dependencies and 
topology sparseness. The approach measures the 
network of dependencies of genes expressed in 
different environments, allowing the environment-induced
plasticity of gene dependencies to be tested in
unprecedented detail. The approach is
also able to characterize the extent to which the 
same genes trigger different amounts of expression 
in response to environmental changes. We 
demonstrated the usefulness of this approach 
through analysing gene expression data from a 
rabbit vein graft study that includes two distinct 
blood flow environments. The proposed approach 
provides a powerful tool for the modelling and 
analysis of dynamic regulatory networks using 
gene expression data from distinct environments. 



INTRODUCTION 

Network analysis using gene expression data has been 
widely applied as an approach to studying the regulatory 
causality of transcriptional processes involved in cell 
survival and proliferation (1-4). In responding to
changes in environmental conditions, a functional cell 
would modify the expression of particular genes through 
signalling regulation to make it possible to preserve the 
robustness of cellular processes (5). A comprehensive 
characterization of regulatory networks behind such an 
environment-induced response becomes essential in 
studying how cells adapt and survive under non-ideal con- 
ditions. However, current strategies for network construc- 
tion from gene expression data in a single environment 
are inadequate for our understanding of the complex regu- 
latory mechanisms that underlie biological adaptation 
and function. Furthermore, the static feature of these 
strategies assumes that genes are expressed in a steady 
state, making it infeasible to describe the dynamic 
patterns of an evolving process (6). 

The purpose of this article is to develop a computational 
model for constructing regulatory networks of dynamic 
gene expression in response to environmental changes. 
The difference of expression for the same gene between 
different environments is called expression plasticity 
(7,8). As a new concept, expression plasticity has emerged 
to be useful for studying the constraints for the evolution of 
gene expression in fluctuating environments (9-11). Our 
model for network construction capitalizes on gene expres- 
sion plasticity, aimed at gleaning a better insight into the 
regulatory mechanisms for an organism's adaptation to 
environmental changes. The model is founded on mutual
information, a quantity that measures the mutual dependence
of two random variables and captures positive, negative
and non-linear correlations (12).

The approach for gene expression analysis with mutual 
information is not entirely new. Michaels et al. (13) 
attempted to cluster dynamic gene expression profiles 
according to information theory. Butte and Kohane (14) 
computed pair-wise mutual information for the expression 
of all genes using a method of discretizing variable
domains. Steuer et al. (1) described the basic theory of 
mutual information and pioneered its usage to detect 
dependencies of different genes. Priness et al. (15) 
compared the properties of different methods for cluster- 
ing gene expression profiles based on mutual information 
and classic Euclidean distance and Pearson correlation 
measures. A path consistency algorithm has been de- 
veloped to reconstruct gene regulatory networks based 
on conditional mutual information (16). There are 
several applications of information-theoretic approaches 
for network reconstruction in a mammalian cellular 
context (17) and Escherichia coli transcriptional studies 
(18). Meyer et al. (19) implemented mutual information
methods in the R package minet for inferring transcriptional
networks from microarray data. Rajapakse et al. (20,21)
used information theory to reconstruct gene regulatory
networks during the differentiation of a multipotential 
haematopoietic progenitor. 

Despite these developments, the use of mutual informa- 
tion to reconstruct regulatory networks based on the 
environment-induced plasticity of time-series expression 
profiles has not been explored. The model presented in 
this article will take advantage of mutual information 
in measuring the non-linear dependency of different variables
to unravel the dynamic changes of network architecture
in response to the environment. The model was used
to analyse experimental gene expression data obtained from 
rabbit vein grafts exposed to two different wall shear con- 
ditions, where these different environments resulted in two 
distinct adaptation phenotypes (22,23). The model has been 
validated through a simulation study. By extending the 
model to reconstruct a web of mutual relationships 
among genes and the target phenotype, it provides a 
useful tool for inferring the causality of gene regulation. 

MUTUAL INFORMATION 

Shannon (24) provided a mathematical theory of 
measuring the amount of uncertainty and quantifying 
the theoretical maximum capacity of information by a 
communication system to eliminate such uncertainty. 
This theory, called information theory, has been widely 
applied in a variety of fields. In what follows, we imple- 
ment Shannon's information theory to reconstruct a regu- 
latory network with gene expression plasticity data. 

Expression plasticity entropy 

In mutual information, we view gene expression profiles as 
a discrete random variable. Suppose that expression 
profiles of genome-wide transcriptional genes are 



measured at the same series of time points for the same 
organism that receives two different treatments. For a par- 
ticular gene, the difference of its time-dependent expres- 
sion curve between the two treatments describes the 
pattern of how this gene responds to the change in the 
treatment's environment. Wang et al. (23) have developed 
a dynamic model for clustering genes into distinct groups 
based on the temporal patterns of their expression profiles 
in relation to specific biological functions.

The model developed here constructs a regulatory network
of these genes in terms of their dynamic relationships formed
in response to environmental change. Mutual information allows
the non-linear dependence among different genes to be
characterized. We define the difference of the expression value
of the same gene at the same time point between the two
environments as the expression plasticity of this gene (7,8).
Let $\Delta X$ denote the time-dependent expression plasticity
variable of a gene at time points $\{1, \ldots, T\}$, expressed as
$\Delta X = \{\Delta x_1, \ldots, \Delta x_T\}$.
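
As a minimal illustration of this definition, the sketch below computes the expression plasticity profile of one gene from two expression profiles measured at the same time points. The function name, the example values and the direction of the difference (high flow minus low flow) are assumptions made for illustration; the text only specifies a difference between the two environments.

```python
def expression_plasticity(expr_env1, expr_env2):
    """Return the per-time-point difference between two expression profiles.

    Both profiles must be measured at the same series of time points.
    The sign convention (env1 minus env2) is an assumption of this sketch.
    """
    if len(expr_env1) != len(expr_env2):
        raise ValueError("profiles must be measured at the same time points")
    return [a - b for a, b in zip(expr_env1, expr_env2)]


# Hypothetical expression values at T = 4 time points (not from the study).
high_flow = [2.0, 3.5, 5.0, 4.0]
low_flow = [1.0, 3.0, 2.5, 4.5]

delta_x = expression_plasticity(high_flow, low_flow)
print(delta_x)
```

The resulting list plays the role of the time-dependent plasticity variable that is discretized and analysed in the following sections.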

Suppose that $D$ is the value range of $\Delta X$, and the
subinterval set $\{D_j\}$, $j = 1, 2, \ldots, M$, is a partition of $D$,
satisfying $\bigcup_j D_j = D$ and $D_j \cap D_k = \emptyset$ if $j \neq k$. Note
that $M$ is the number of subintervals partitioned from the
domain $D$. For convenience, we denote the partition $\{D_j\}$
simply as $D$. Define the delta function as follows,

$$\delta(\Delta x_i, D_j) = \begin{cases} 1, & \text{if } \Delta x_i \in D_j \\ 0, & \text{else;} \end{cases}$$

where $i = 1, 2, \ldots, T$, and $j = 1, 2, \ldots, M$. Then the probability
of $D_j$ according to the expression plasticity variable
$\Delta X$ is defined as,

$$p_{\Delta X}(D_j) = \frac{1}{T} \sum_{i=1}^{T} \delta(\Delta x_i, D_j), \quad j = 1, 2, \ldots, M.$$

Based on the probability defined above, in accordance
with Shannon (24), the entropy of $\Delta X$ with a given partition
$D$ is defined as

$$H_D(\Delta X) = -\sum_{j=1}^{M} p_{\Delta X}(D_j) \log p_{\Delta X}(D_j), \tag{1}$$

where the base of the logarithm, usually 2 or
e, could be any positive number other than 1 without changing
the properties of entropy. In this article, we will use 2, in
accordance with the definition based on bits by
Shannon. If $p_{\Delta X}(D_j) = 0$ in Equation (1), the expression
$p_{\Delta X}(D_j) \log p_{\Delta X}(D_j)$ is mathematically undefined. But it
can be redefined to be its limit 0 when $p_{\Delta X}(D_j)$ goes to
0 from the right side of 0.

According to Fraser and Swinney (25), when the
measurement is expressed as $\Delta X$, we can describe the
expression plasticity entropy $H_D(\Delta X)$ as the degree of
surprise, i.e. the elimination of uncertainty about $\Delta X$.
Information entropy has many properties, several of
which are listed as follows:

(i) The entropy $H_D(\Delta X)$ reaches its minimum 0 if the
expression plasticity $\Delta X$ as a variable is determined,
i.e. $\Delta X$ is no longer random. In this case, the
probability of one element in $\{D_1, \ldots, D_M\}$ is 1 and
that of each of the other elements is 0.
(ii) If $\{D_1, \ldots, D_M\}$ are equiprobable, then the entropy
$H_D(\Delta X)$ is maximized to the value $\log M$. In this
case, $\Delta X$ is the most uncertain, i.e. the hardest to predict.
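
The two properties above can be checked numerically. The following is a minimal sketch, assuming the expression plasticity values have already been assigned to bins and are represented by integer bin labels; the function name `entropy` and the example label sequences are illustrative and not part of the original method description.

```python
import math
from collections import Counter


def entropy(bin_labels):
    """Shannon entropy (base 2) of a sequence of bin labels, as in Equation (1)."""
    total = len(bin_labels)
    counts = Counter(bin_labels)
    # Empty bins never appear in `counts`, which matches the convention
    # that p * log(p) is taken to be 0 when p = 0.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Property (i): every value falls in the same bin, so nothing is uncertain.
deterministic = [0, 0, 0, 0, 0, 0, 0, 0]
# Property (ii): M = 4 bins are used equally often, so entropy is log2(4).
equiprobable = [0, 1, 2, 3, 0, 1, 2, 3]

h_min = entropy(deterministic)
h_max = entropy(equiprobable)
print(h_min, h_max)
```

Any other assignment of eight values to four bins gives an entropy between these two extremes.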

Conditional entropy of expression plasticity 

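Analogously to the delta function defined earlier in the
text, we can also define the joint delta function as follows.

$$\delta(\Delta x_i, \Delta y_i, D_j, D_k) = \begin{cases} 1, & \text{if } \Delta x_i \in D_j \text{ and } \Delta y_i \in D_k \\ 0, & \text{else;} \end{cases}$$

where $i = 1, 2, \ldots, T$, and $j, k = 1, 2, \ldots, M$. Then the joint
probability and the conditional probability of $(D_j, D_k)$
according to the expression plasticity variables $\Delta X$ and
$\Delta Y$ are defined as follows, respectively.

$$p_{\Delta X, \Delta Y}(D_j, D_k) = \frac{1}{T} \sum_{i=1}^{T} \delta(\Delta x_i, \Delta y_i, D_j, D_k), \quad j, k = 1, 2, \ldots, M.$$

$$p_{\Delta X | \Delta Y}(D_j, D_k) = \frac{\sum_{i=1}^{T} \delta(\Delta x_i, \Delta y_i, D_j, D_k)}{\sum_{i=1}^{T} \delta(\Delta y_i, D_k)}, \quad j, k = 1, 2, \ldots, M.$$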

According to information theory (24), we can calculate
the conditional entropy of the expression plasticity of
one gene $\Delta X$, given the expression plasticity of another
gene $\Delta Y$ with time-series values $\{\Delta y_1, \ldots, \Delta y_T\}$, which is
defined as

$$H_D(\Delta X | \Delta Y) = -\sum_{j=1}^{M} \sum_{k=1}^{M} p_{\Delta X, \Delta Y}(D_j, D_k) \log p_{\Delta X | \Delta Y}(D_j, D_k), \tag{2}$$

where $H_D(\Delta X | \Delta Y)$ is the conditional entropy measuring
the remaining uncertainty of $\Delta X$ if $\Delta Y$ is determined,
which has the following property,

$$H_D(\Delta X | \Delta Y) \le H_D(\Delta X). \tag{3}$$

If $\Delta X$ and $\Delta Y$ are statistically independent of each
other, we have

$$H_D(\Delta X | \Delta Y) = H_D(\Delta X). \tag{4}$$


Joint entropy of expression plasticity 

The joint entropy of expression plasticity for the
two genes, $H_D(\Delta X, \Delta Y)$, is defined, analogously to
$H_D(\Delta X)$, as

$$H_D(\Delta X, \Delta Y) = -\sum_{j=1}^{M} \sum_{k=1}^{M} p_{\Delta X, \Delta Y}(D_j, D_k) \log p_{\Delta X, \Delta Y}(D_j, D_k), \tag{5}$$

where $p_{\Delta X, \Delta Y}(D_j, D_k)$ is defined based on the expression
plasticity variables $\Delta X$ and $\Delta Y$ for the two genes. The joint
entropy is not greater than the sum of the entropies of two
expression plasticity variables, i.e.

$$H_D(\Delta X, \Delta Y) \le H_D(\Delta X) + H_D(\Delta Y). \tag{6}$$

If $\Delta X$ and $\Delta Y$ are statistically independent, we have

$$H_D(\Delta X, \Delta Y) = H_D(\Delta X) + H_D(\Delta Y). \tag{7}$$

The relationship among the entropy, conditional entropy
and joint entropy is expressed as

$$H_D(\Delta X, \Delta Y) = H_D(\Delta X | \Delta Y) + H_D(\Delta Y). \tag{8}$$

Equation (8) implies that the uncertainty of the joint
system $(\Delta X, \Delta Y)$ is the uncertainty of $\Delta Y$, plus the
conditional uncertainty of $\Delta X$ given $\Delta Y$.
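
The relationships in Equations (3), (6) and (8) can be verified on a small example. The sketch below is an illustration only: it assumes that both plasticity profiles have already been discretized into integer bin labels, and the function names and the example label sequences are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of hashable labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def joint_entropy(x_bins, y_bins):
    """Joint entropy, Equation (5): entropy of the paired bin labels."""
    return entropy(list(zip(x_bins, y_bins)))


def conditional_entropy(x_bins, y_bins):
    """Conditional entropy H(X | Y) computed directly from Equation (2)."""
    total = len(x_bins)
    joint_counts = Counter(zip(x_bins, y_bins))
    y_counts = Counter(y_bins)
    h = 0.0
    for (_, y_label), count in joint_counts.items():
        p_joint = count / total            # joint probability of the bin pair
        p_cond = count / y_counts[y_label]  # probability of X bin given Y bin
        h -= p_joint * math.log2(p_cond)
    return h


# Hypothetical bin labels for two genes at T = 8 time points.
x_bins = [0, 0, 1, 1, 2, 2, 0, 1]
y_bins = [0, 1, 0, 1, 0, 1, 1, 0]

h_x, h_y = entropy(x_bins), entropy(y_bins)
h_xy = joint_entropy(x_bins, y_bins)
h_x_given_y = conditional_entropy(x_bins, y_bins)
print(h_xy, h_x_given_y + h_y)  # the two sides of Equation (8)
```

Computing the conditional entropy directly from Equation (2), rather than as a difference of entropies, makes the check of Equation (8) non-trivial.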

Mutual information 

The mutual information between two variables of expression
plasticity $\Delta X$ and $\Delta Y$ according to a domain partition
$D$ is defined as

$$I_D(\Delta X, \Delta Y) = H_D(\Delta X) + H_D(\Delta Y) - H_D(\Delta X, \Delta Y). \tag{9}$$

From Equation (6), we have

$$I_D(\Delta X, \Delta Y) \ge 0. \tag{10}$$

Furthermore, from Equation (7), we obtain the conclusion
that, if $\Delta X$ and $\Delta Y$ are statistically independent, their
mutual information is 0.

Mutual information is symmetrical, i.e.

$$I_D(\Delta X, \Delta Y) = I_D(\Delta Y, \Delta X). \tag{11}$$

In sum, mutual information shown by Equation (9)
measures the dependency between the expression plasticity
of two arbitrary genes, whether the dependency is linear
or non-linear.
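
To make the contrast with Pearson correlation concrete, the following sketch compares the two measures on a purely quadratic relationship. It is an illustration under stated assumptions: each distinct value is treated as its own bin, the data are synthetic, and the helper names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of hashable labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def mutual_information(x_bins, y_bins):
    """Mutual information from Equation (9): H(X) + H(Y) - H(X, Y)."""
    return (entropy(x_bins) + entropy(y_bins)
            - entropy(list(zip(x_bins, y_bins))))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic example: y depends on x only through x squared.
x = [-2, -1, 0, 1, 2] * 4
y = [v * v for v in x]

r = pearson(x, y)
mi = mutual_information(x, y)
print(r, mi)

# Two variables whose bin labels are independent share no information.
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Here the linear correlation vanishes because the relationship is symmetric about zero, whereas the mutual information stays clearly positive.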



DISCRETIZATION 

To apply mutual information of expression plasticity, the 
random variable domain must first be partitioned into 
discrete bins. Butte and Kohane (14) used a straightfor- 
ward method of evenly dividing a domain interval into a 
certain number of sub-intervals and then approximating 
the probabilities by the corresponding relative frequencies 
of occurrence. The mutual information by this approach 
depends much on the distribution type of the expression 
plasticity variables and the distribution parameters. 
Schreiber and Schmitz (26) proposed an adaptive partitioning
method, in which each resultant sub-interval for a random
variable contains an approximately equal number of occurrences.
This method is more precise than the straightforward one in
finding the mutual information. In Supplementary Text S1, we
illustrate the procedure of bin characterization by these
two methods.
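
The difference between the two binning schemes can be sketched as follows. This is a simplified illustration, not the procedure of Supplementary Text S1: the function names are hypothetical, the data are synthetic, and tied values are split by rank in the equal-frequency version.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign values to bins holding approximately equal numbers of points."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, index in enumerate(order):
        labels[index] = rank * n_bins // len(values)
    return labels


# Synthetic, strongly skewed values: one large outlier.
values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 10.0]

width_labels = equal_width_bins(values, 4)
freq_labels = equal_frequency_bins(values, 4)
print(Counter(width_labels))  # almost all points share a single bin
print(Counter(freq_labels))   # points are spread evenly across bins
```

With skewed data the equal-width scheme leaves most bins empty, which is one reason why its mutual information estimate depends strongly on the distribution of the variables.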

The two methods described earlier in the text may not
produce ideal results when the variable distribution types
are the same but the distribution parameters are different.
This is common, especially for gene expression data.
To improve the calculation of mutual information by
Schreiber and Schmitz's (26) method, we partition the
domains of the two random variables of expression plasticity
under consideration according to a common standard,
while simultaneously making the intervals adaptive to the
respective data. We call this process 'common adaptive
partitioning'. Let $\Delta x_t$ and $\Delta y_{t'}$ denote two random variables
of expression plasticity measured at time $t$ and $t'$,
respectively, expressed as

$$(\Delta x_t, \Delta y_{t'}), \quad t, t' = 1, \ldots, T, \tag{12}$$



whose means are denoted as $(\mu_x, \mu_y)$ and standard deviations
denoted as $(\sigma_x, \sigma_y)$. Suppose $\{q_0, q_1, \ldots, q_r, q_{r+1}\}$
is a sequence of real numbers, with $q_0 = -\infty$, $q_{r+1} = \infty$ and
$q_\tau < q_{\tau+1}$ for $1 \le \tau \le r - 1$. Except for the two infinities,
the other $r$ parameters are to be determined later, which
are denoted as

$$\Theta = \{q_1, q_2, \ldots, q_r\}. \tag{13}$$

The domains of $\Delta X$ and $\Delta Y$ are partitioned by a transformation
of the sequence into the following intervals,
expressed, respectively, as

$$D^X_\tau = (\mu_x + \sigma_x q_{\tau-1},\ \mu_x + \sigma_x q_\tau], \qquad D^Y_{\tau'} = (\mu_y + \sigma_y q_{\tau'-1},\ \mu_y + \sigma_y q_{\tau'}], \qquad \tau, \tau' = 1, \ldots, r, r+1. \tag{14}$$

Let $k^X_\tau$, $k^Y_{\tau'}$ and $k^{X,Y}_{\tau,\tau'}$ denote the numbers of
time-dependent expression plasticity values from Equation
(12) located in the $\tau$th interval of $X$, in the $\tau'$th interval
of $Y$ and in the $\tau$th interval of $X$ while simultaneously in
the $\tau'$th interval of $Y$, respectively.

Our purpose is to select an optimal parameter set
described in Equation (13) that makes the time-dependent
expression plasticity profiles divided as evenly as possible
for both $\Delta X$ and $\Delta Y$ domains. This criterion is determined
by a statistic

$$C = \min_{\Theta} \{\mathrm{var}(k^X_\tau) + \mathrm{var}(k^Y_{\tau'})\}. \tag{15}$$



Several optimization techniques, such as simulated
annealing and genetic algorithms, are available to
solve the optimization task described in Equation (15).
Supplementary Text S2 gives a procedure for uniformly
dividing time-dependent expression plasticity for the two
genes. After the time-dependent expression plasticity
profiles are divided per Equation (15), we calculate three
kinds of probabilities as follows:

$$p^X_\tau = \frac{k^X_\tau}{T}, \qquad p^Y_{\tau'} = \frac{k^Y_{\tau'}}{T}, \qquad p^{X,Y}_{\tau,\tau'} = \frac{k^{X,Y}_{\tau,\tau'}}{T}, \tag{16}$$

where $T$ is the total number of time points as defined
in Equation (12). These probabilities are then used to
calculate the mutual information between the expression
plasticity variables $\Delta X$ and $\Delta Y$ by Equations (1), (5) and
(9). The partition determined by $C$ in Equation (15) is called
the common partition of expression plasticity variables $\Delta X$
and $\Delta Y$.
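
The idea of common adaptive partitioning can be sketched with a small exhaustive search. This is not the procedure of Supplementary Text S2: it replaces simulated annealing or a genetic algorithm with a brute-force search over a hypothetical grid of candidate cut points, it assumes the population standard deviation is used for standardization, and all names and data are illustrative.

```python
import itertools
import statistics
from bisect import bisect_left


def standardize(values):
    """Express values in units of standard deviations from their mean."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population SD is an assumption here
    return [(v - mu) / sigma for v in values]


def bin_counts(z_values, cuts):
    """Count values per interval (q_{k-1}, q_k], with infinite outer bounds."""
    counts = [0] * (len(cuts) + 1)
    for z in z_values:
        counts[bisect_left(cuts, z)] += 1
    return counts


def common_adaptive_partition(dx, dy, n_cuts, candidates):
    """Pick shared standardized cut points that minimize the criterion C."""
    zx, zy = standardize(dx), standardize(dy)
    best_cuts, best_score = None, float("inf")
    for cuts in itertools.combinations(sorted(candidates), n_cuts):
        # Sum of the variances of the bin counts for both variables.
        score = (statistics.pvariance(bin_counts(zx, cuts))
                 + statistics.pvariance(bin_counts(zy, cuts)))
        if score < best_score:
            best_cuts, best_score = cuts, score
    return best_cuts, best_score


# Synthetic plasticity profiles: same shape, different location and scale.
dx = [1, 2, 3, 4, 5, 6]
dy = [10, 20, 30, 40, 50, 60]
candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical grid of cut points

cuts, score = common_adaptive_partition(dx, dy, 2, candidates)
print(cuts, score)

# Interval bounds for dy on its original scale: mean + SD * cut point.
mu_y, sigma_y = statistics.mean(dy), statistics.pstdev(dy)
bounds_y = [mu_y + sigma_y * q for q in cuts]
```

Because the cut points are shared on the standardized scale, the two variables receive comparable bins even though their means and standard deviations differ.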



MUTUAL INFORMATION BETWEEN GROUPS AND 
ENVIRONMENTS 

In gene expression analysis, clustering is a first step 
towards studying gene function by subdividing the genes 
into a smaller number of categories and then comparing 
dissimilarities among the categories (23,27). In each 
category or group, there are a set of functionally similar 
genes. From the perspective of mutual information, we 
want to know whether the grouping result is reasonable 
and how the groups are related with each other. To solve 
these problems, the mutual information between and 
within groups should first be defined. 

For any two groups, $G_1$ and $G_2$, there are a number of
genes with a similar dynamic expression plasticity trajectory.
Let $X$ and $Y$ denote an arbitrary gene from groups
$G_1$ and $G_2$, respectively. The mutual information of expression
plasticity $\Delta X$ and $\Delta Y$ between the two groups according
to a common partition $D$ for $G_1$ and $G_2$ is defined as

$$I_D(G_1, G_2) = \frac{1}{|G_1| |G_2|} \sum_{\Delta X \in G_1} \sum_{\Delta Y \in G_2} I_D(\Delta X, \Delta Y), \tag{17}$$

where $|G_1|$ and $|G_2|$ are the numbers of genes in $G_1$ and $G_2$,
respectively.

The calculation of the dependence of gene expression in
response to different environments is based on the mutual
information of environmentally induced expression plasticity.
There is an alternative to calculating such dependence,
which is based on the mutual information of gene
expression between two environments. Let $X$ denote
an arbitrary gene from a group $G$. In this group, this
across-environment mutual information according to a
common partition $D$ is defined as

$$I_D(G) = \frac{1}{|G|} \sum_{X \in G} I_D(X_L, X_H), \tag{18}$$

where $|G|$ is the number of genes in $G$; $X_L$ and $X_H$ are the
expression profiles of a gene in environment $L$ and $H$,
respectively.
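
The two group-level quantities can be sketched as simple averages of pairwise mutual information. The code below is an illustration under stated assumptions: every profile is already discretized into bin labels on a common partition, genes are listed in the same order in both environments, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of hashable labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def mutual_information(x_bins, y_bins):
    """Pairwise mutual information, H(X) + H(Y) - H(X, Y)."""
    return (entropy(x_bins) + entropy(y_bins)
            - entropy(list(zip(x_bins, y_bins))))


def between_group_mi(group1, group2):
    """Average mutual information over all gene pairs from two groups."""
    total = sum(mutual_information(x, y) for x in group1 for y in group2)
    return total / (len(group1) * len(group2))


def across_environment_mi(group_low, group_high):
    """Average mutual information between each gene's two environments."""
    pairs = list(zip(group_low, group_high))
    return sum(mutual_information(lo, hi) for lo, hi in pairs) / len(pairs)


# Toy binned plasticity profiles (T = 4) for two small groups of genes.
g1 = [[0, 0, 1, 1], [0, 1, 0, 1]]
g2 = [[0, 0, 1, 1]]
mi_between = between_group_mi(g1, g2)

# Toy binned expression profiles of the same two genes in low and high flow.
low = [[0, 0, 1, 1], [0, 1, 0, 1]]
high = [[1, 1, 0, 0], [0, 0, 1, 1]]
mi_across = across_environment_mi(low, high)
print(mi_between, mi_across)
```

In both toy cases one gene pair is perfectly dependent (1 bit) and the other is independent (0 bits), so the group average is 0.5 bits.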

Equation (17) provides a procedure for calculating the 
mutual information of dynamic expression plasticity 
between different groups of genes. The reconstruction of 
regulatory networks from dynamic expression plasticity 
trajectories can shed light on the mechanistic pattern of 
how genes respond differently to environmental change 
according to their biological function. Equation (18) can 
be used to study the dependence of the expression of 
individual genes between different environments. By 
accumulating all genes within groups, different groups 
can be compared for the extent of such dependence. 



RESULTS 

Working example 

In previous work by Wang et al. (23), a dynamic model 
was developed and used to identify unique groups of genes 
based on their differential response to the local environ- 
ment. Specifically, vein bypass grafts, exposed to either 
high or low flow, were harvested at 2 h, 1 day, 7 days or 28
days after implantation (20). Microarray analysis of 
14958 genes was used to define and cluster the temporal 
response of the transcriptional profile induced by the 
local flow environment. Wang et al.'s model identified 
eight groups, symbolized by A (0.0116), B (0.0123), 
C (0.3354), D (0.3831), E (0.1134), F (0.0359), 
G (0.0100) and H (0.0083), where the numbers in 
parentheses are the proportions of genes belonging to a 
particular group. These groups display different patterns 
of environment-induced changes in gene expression tra- 
jectories. Our mutual information approach was applied 
to reconstruct a regulatory network that encompassed 
the dynamics of gene expression. Our analysis was based 
on three scenarios: (i) reconstructing an overall network 
by jointly using time-series gene expression data from the 
two flow environments; (ii) reconstructing a network by 
using the expression plasticity between high and low flows; 
and (iii) reconstructing two networks by using time-series 
gene expression data separately for two flows. 

According to scenario (i), a sparse network of gene ex- 
pression was obtained (Figure 1), in which a few pairs of 
gene groups have regulatory connections. Of all pair-wise 
relationships, group A shares the highest level of mutual 
information with group H, followed by the level of mutual 
information between groups H and F, groups B and E, 
groups A and F, groups B and F and so forth. Several 
pairs of groups share very low mutual information. 
It appears that groups C, D and G are substantially dis- 
similar to the rest of the groups, with each of these groups 
only weakly connected with two other clusters. 

Scenario (ii) emphasizes the similarities of gene groups
in terms of their pattern of differential expression over two
different flows. Figure 2a provides a quantitative description
of the level of regulatory connections among eight
gene groups identified by Wang et al.'s (23) dynamic
model. Although many connections are observed, the
levels of mutual information are highly variable. To
respond to environmental changes from one flow to the
next, groups A and H, groups H and F, and groups F
and D would adjust their expression profiles in a highly
similar way. As such, we conclude that groups A, H, F and
D share substantial overlapping information, compared
with other clusters in the network. The significant
overlap and network autonomy among these four
groups is further underscored by the configuration of
group D, which, except for weak connections with
groups C and B, only demonstrates the dominant connection
to group F.

Figure 1. An overall regulatory network of eight groups of rabbit
genes constructed by jointly using expression data from high and low
flows.

To reconstruct the networks per scenario (iii), we
calculated with the common partition $D$ the mutual information
of expression dynamics of genes $X_j$ and $Y_j$ from
two groups $G_1$ and $G_2$, respectively, in a particular environment
$j$ using

$$I_D^{j}(G_1, G_2) = \frac{1}{|G_1| |G_2|} \sum_{X \in G_1} \sum_{Y \in G_2} I_D(X_j, Y_j), \quad j = L, H.$$

This equation was used to calculate group-wise dependence
separately for each different environment. It is interesting
to see that the degree of dependence between groups
is not identical for low and high flows (Figure 2b). For
example, group A is associated with groups C and D in
the low flow, but this association does not occur in the
high flow. Figure 2b provides a quantitative measure of
the difference in the level of group-wise mutual information
of gene expression between the two treatments. In
addition, the amount of mutual information of the two
treatments for the same group varies, depending on group
type. Group G is most highly associated between low and
high flows, followed by groups H, E and F. Within group
D, the two flows are weakly associated. The results given
in Figure 2 provide a comprehensive characterization of
regulatory networks of genes related to vein graft remodeling,
which are expressed differently in response to
low and high flow environments.

Figure 2. Regulatory network of eight groups of rabbit genes expressed
in high and low flows. (a) Between-group network reconstruction based
on average value of expression between the two treatments. (b) Within-
and between-group network reconstruction based on gene expression
separately in low and high flows. The thickness of a circle represents
the level of mutual information between two treatments within a group,
whereas the thickness of lines represents the level of mutual information
between two groups in low (green) and high flows (red). The lines
representing mutual information below the average level are omitted.

Computer simulation 

The basic idea of using mutual information to reconstruct 
networks for genes expressed in a single environment has 
been available in the literature. Some studies critically 
analysed the advantages of information-based approaches 
over those based on classic Euclidean distance and 
Pearson correlation measures (1,15). Thus, we will not
focus on methodological comparisons in this article,
but rather on the advantage of our information-based
approach in studying gene expression plasticity.

We simulated two data sets, each of three equally sized
groups of genes expressed in a time course. In the first data
set, genes are measured in a single environment, whereas
the second data set contains genes measured in two different
treatments. Our model was used to analyse these
two sets of data, and the results agreed well with the
actual structure of each data set (Figure 3). However,
a traditional single-environment approach cannot be expected
to give a good result for gene expression in two
environments.

DISCUSSION 

Many biological processes including plant and animal de- 
velopment are coordinated by cell-to-cell communication 
regulated by genes (5). High-throughput measurement 
techniques have now led to the identification of tens of 
thousands of genes involved in sensing external cues. 
However, the dynamic interplay between genes is highly 
complex and cannot be understood by a simple approach 
(28). The reconstruction of gene regulatory networks can 
be a valuable tool for identifying the key mechanisms that 
shape the dynamics of cellular and transcriptional 
processes (6,29). 

External stimuli or agents can alter the speed and direction
of cellular processes through differential expression
of the gene set. There exist specific mechanisms that
shepherd the signal into the nucleus, where signal integration
occurs by complex transcription factor networks.
In this article, we describe a procedure for quantitative
modelling of biological regulatory networks regulated by
gene expression using mutual information. Beyond classic
correlation parameters, mutual information can measure
and evaluate the non-linear dependencies of random variables
(12,14,24). We extended this information-based
approach to assess and detect the non-linear dependencies
of genes both between and within different gene groups of
a particular function.

Figure 3. Regulatory network constructed from simulated data sets of
genes expressed in a single environment (a) and two different environments
(b). Panel titles: 'Treatments not considered' (a) and 'Treatments
considered' (b). The scale shows mutual information from 0 to 1.8.

Our model has combined two complexities of network 
reconstruction. First, although much previous work 
focuses on static (steady-state) gene regulation, improved 
biotechnologies have allowed the measurement of dynamic
gene expression data during a biological process. The 
availability of dynamic data enables geneticists to better 
study the regulatory machineries underlying cellular 
processes (2) but, meanwhile, brings about a difficulty 
in analysing and interpreting expression data. Second, 
as gene expression is environment dependent (5), the 
reconstruction of regulatory networks by integrating 
environmental impact is crucial. By taking into account 
dynamic and environment-dependent complexities of gene 
expression, our model allows the reconstruction of more 
mechanistic and, therefore, more powerful regulatory 
networks. 

The new model based on mutual information can effect- 
ively handle any dynamic relationships of genes, linear or 
non-linear, a characteristic better than classic Euclidean 
distance and Pearson correlation measures (15), and 
thereby should be able to find its broader application in 
computational biology. The model was used to analyse a 
time-series data set of gene expression measured for vein 
bypass grafts subjected to two distinct conditions, high 
and low blood flow, leading to the construction of a
genetic network that connects different groups of genes
with different response trajectories to the local environ- 
ment (23). The model can quantify the mutual dynamic 
relationships of different genes in terms of their differential
expression to environmental change. The model was
validated through computer simulation, showing its practical
usefulness. In practice, when the number of genes
is large, an inference procedure for selecting important
groups, such as a permutation procedure, may be
helpful and can be implemented.
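
One possible form of such a permutation procedure is sketched below. It is an assumption of this illustration, not a procedure specified in the article: the bin labels of one profile are shuffled across time points to build a null distribution of mutual information, which ignores temporal autocorrelation. The function names, the number of permutations and the example data are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of hashable labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def mutual_information(x_bins, y_bins):
    """Pairwise mutual information, H(X) + H(Y) - H(X, Y)."""
    return (entropy(x_bins) + entropy(y_bins)
            - entropy(list(zip(x_bins, y_bins))))


def permutation_p_value(x_bins, y_bins, n_perm=200, seed=0):
    """Share of shuffled data sets whose mutual information reaches the observed one."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = mutual_information(x_bins, y_bins)
    shuffled = list(y_bins)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(x_bins, shuffled) >= observed:
            exceed += 1
    # Add-one correction so that the estimated p-value is never exactly 0.
    return (exceed + 1) / (n_perm + 1)


# Synthetic example: two perfectly dependent binned profiles (T = 24).
x_bins = [0, 1, 2, 3] * 6
y_bins = list(x_bins)

p_value = permutation_p_value(x_bins, y_bins)
print(p_value)
```

Connections whose p-value exceeds a chosen threshold could then be dropped from the network; the choice of threshold and any multiple-testing correction are left open here.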

There is much room for the model to be improved. 
First, our model assumes a normal distribution of gene 
expression, which is reasonable for microarray data. 
However, an increasing body of expression data is being 
collected by high throughput cDNA sequencing (RNA- 
Seq). The current model will need to be modified to accommodate
the Poisson distribution, which characterizes the data
obtained from RNA-Seq (30).
Second, the ultimate goal of network construction is to 
identify key genes or elements that can determine or 
alter the behaviour of an outcome, such as the critical 
stenosis that leads to vein bypass graft failure. Thus, the 
incorporation of outcome variables into the network and 
the estimation of direct or indirect effects of each gene on 
the outcome are essential for mechanistic characterization. 
Third, it is likely that the regulation of gene elements 
is under global genetic control (31). The integration of 
mutual information into genetic mapping will provide a 
powerful means of identifying expression quantitative trait 
loci that control regulatory networks. The characteriza- 
tion of expression quantitative trait loci will enable gen- 
eticists to gain a better understanding of the aetiology 
underlying complex traits or diseases. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Texts 1 and 2. 